I. Introduction

The overall goal of the No Child Left Behind Act (NCLB) of 2001 is to close, by the end
of the 2013-2014 academic year, “the achievement gap between high- and low-performing
children, especially the achievement gap between minority and non-minority
students, and between disadvantaged children and their more advantaged peers” (NCLB,
2001, Sec. 1001[3]). Under the federal NCLB mandates, adequate yearly progress (AYP)
targets must be set for the entire period from 2002 to 2014 in order to ensure that all
students and all schools eventually meet the content and performance standards adopted
in their respective states. It was within this context that the Hawaii Department of
Education (HDOE) launched its Hawaii State Assessment (HSA) in spring 2002.

The accountability provisions in the NCLB explicitly refer to two demographic variables
underlying the current inequity in public education: economic disadvantage and
race/ethnicity. Accountability under the NCLB is, in essence, accountability for
subgroups, particularly subgroups that have been historically disadvantaged by low
income and minority status. It is therefore important to investigate the extent to which
student performance on the 2002 HSA was determined by economic disadvantage and
minority status, so that the HDOE may have a clear baseline picture against which it can
judge how well Hawaii’s public schools level the playing field from 2002 to 2014 to
ensure educational equity.



II. The Hawaii State Assessment (HSA) Reading Test 

The HSA was designed years before the NCLB was authorized in 2001. As a response to
the public’s demand for accountability and the 1994 Improving America’s Schools Act
(IASA), the HDOE decided to reform its statewide testing program into a three-tiered
standards-guided assessment system (Hawaii Department of Education Office of
Accountability and School Instructional Support / School Renewal Group, 1999).
Assessment is to be conducted at the classroom, school and state levels in accordance
with the revised Hawaii Content and Performance Standards, known as HCPS II. Since
2001, the state-level assessment has been the primary instrument by which the HDOE
intends to demonstrate compliance with the NCLB. For the purpose of this study, the
HSA refers only to the state-level assessment. In spring 2002, the new HSA reading and
math tests were administered to grades 3, 5, 8 and 10. The present study reports analyses
based on the 2002 reading tests.

The HSA reading tests are based upon four “strands” of standards: 

1. Range - “Read a range of literary and informative texts for a variety of purposes 
including those students set for themselves.” 

2. Processes - “Develop and use strategies within the reading processes to construct 
meaning.” 

3. Conventions and Skills - “Develop and apply an understanding of the conventions 
of language and texts to construct meaning.” 

4. Response and Rhetoric - “Using individual reflection and group interaction, 
comprehend and respond to texts from a range of stances: personal, critical and 
creative.” (Hawaii Department of Education Office of Curriculum Instruction and 
Student Support / Instructional Services Branch, 2003, pp. 14-16) 

Each reading test consists of two types of items: selected response (multiple-choice) 
items and constructed response (short or extended answer) items that require writing 
(Hawaii Department of Education, 2001). All test items are matched with the second, 
third and fourth strands. Scaled total scores are classified into four performance levels: 
exceeding, meeting, approaching, and well below proficiency. The scaled total of 300 is 
set as the cutoff proficiency score. Any student scoring below 300 is considered to have 
failed to meet the expected proficiency level. 



III. Research Questions 

The objective of the present study is to examine the impact of three demographic 
variables, poverty, ethnicity and gender, on the risk of a student failing to meet the HSA 
proficiency standards in 2002, with the hope that follow-up research in subsequent years 
will point to an appreciable and steady decline in the negative impact of poverty and 
minority status on academic achievement, as compared with the 2002 baseline. The 
variable of gender has been included in the analyses because past research in Hawaii has 
consistently shown a gender difference in favor of girls in both language arts and math 
(e.g., Brandon & Jordan, 1994; Brandon, Newton & Hammond, 1987; Reiss, 2005). No 
reliable information is currently available from the HDOE database about other important
demographic variables, such as length of residence in an English-speaking country, home
language use, parents’ educational attainment, and family income relative to household
size. Those variables therefore could not be included in this research.

The present study addresses three specific research questions:

1. To what extent is HSA reading performance influenced by gender, poverty or
ethnicity separately? This set of three univariate analyses provides an initial 
understanding of the impact of each of the demographic variables on the odds 
of a student failing to reach reading proficiency. 

2. Is there a general pattern of the effects due to the three demographic variables 
across the grade levels? With the three predictors incorporated into one single 
predictive model for each grade level, would a generalizable predictive model, 
with or without interaction effects, emerge? 

3. And finally, how accurate are the predictive models with respect to different 
racial/ethnic subgroups? Attention will be directed beyond an overall percent 
of correct classification to two other aspects of predictive accuracy: probability 
of false identification of failure and probability of false identification of pass. 
Such knowledge will help to adjust the understanding of the overall predictive 
model with respect to various racial/ethnic subgroups. 



IV. Method 

The dependent variable in this study is the binary variable of pass/fail (pass = 1, fail = 0).
The event of failure (0) is modeled in logistic regression. More specifically, it is the log
odds of failure, i.e., ln(p/(1-p)), that is regressed on the predictors. The letter p refers to
a student’s probability (risk) of failure. The ratio of the probability of failure, p,
to the probability of pass, (1-p), is known as the odds. The three independent variables are
operationally defined below:

gender (male = 0, female = 1)

low-income status (ineligible for free or reduced price lunch = 0; eligible for free
or reduced price lunch = 1)

race/ethnicity (East Asian, Filipino, Hawaiian or White, dummy coded with the 
White group designated as the reference group) 

For any statistically significant logistic regression coefficients, their profile likelihood 
odds ratios and the 95% confidence limits are reported (SAS Institute, 1995). 
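The relationship among probability, odds, and log odds described above can be sketched in a few lines of Python. This is only an illustration of the definitions; the function names and the example probability are assumptions introduced here, not part of the study's SAS analysis.

```python
import math


def odds(p: float) -> float:
    """Odds of failure: probability of failure divided by probability of pass."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p / (1.0 - p)


def log_odds(p: float) -> float:
    """Natural log of the odds, i.e. ln(p / (1 - p))."""
    return math.log(odds(p))


def probability_from_log_odds(z: float) -> float:
    """Invert the log odds to recover the probability of failure."""
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative value only: a student with a 0.75 probability of failure
# has odds of failure of 3 to 1.
example_p = 0.75
example_odds = odds(example_p)
```

A probability of 0.50 corresponds to odds of 1 and log odds of 0, which is why a regression coefficient of 0 implies no difference in the odds of failure.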

The data set has 9,257 third graders (75.35% of all third graders who took the HSA), 
9,602 fifth graders (77.01%), 8,043 eighth graders (75.73%), and 6,504 tenth graders 
(71.72%). The student population in Hawaii is so diverse that all
racial/ethnic groups, including Whites, are numerical minorities. The four largest
racial/ethnic groups, East Asian (Japanese, Chinese and Korean), Filipino, Hawaiian, and
White, constitute approximately three quarters of the student population. Those are the
racial/ethnic categories that have historically been used in Hawaii educational studies.



Table 1



V. Results and Discussion



V.1. Research Question One

The failure rates by gender, low-income status, and ethnicity are reported in Table Two 
below. 



Table 2 



As expected, girls have a significantly lower failure rate than boys in reading across the
grade levels, with statistically significant odds ratios of 0.73, 0.61, 0.54 and 0.49 for
grades 3, 5, 8 and 10 respectively. In other words, the odds of failure for girls are 27%,
39%, 46% and 51% lower than those for boys at the four grade levels respectively. This
single-predictor model has an adjusted R-square (Nagelkerke, 1991) of 0.74, 0.75, 0.75, and
0.75 for grades 3, 5, 8 and 10 respectively. It is not clear what exactly accounts for the
persistent gender difference because gender can be interpreted as a composite of
numerous biological, psychological and socio-cultural factors. However, this finding does
have profound pedagogical implications if the HDOE is to be serious about ensuring that
all students, boys as well as girls, attain the expected reading proficiency at each grade.
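The step from an odds ratio to a statement such as "27% lower odds" is simple arithmetic: the percent change in odds is (odds ratio - 1) x 100. A minimal sketch follows, using the four gender odds ratios reported above; the function and variable names are illustrative assumptions.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in the odds relative to the reference group.

    Negative values mean lower odds than the reference group.
    """
    return (odds_ratio - 1.0) * 100.0


# Odds ratios for girls vs. boys as reported in the text, keyed by grade.
gender_odds_ratios = {3: 0.73, 5: 0.61, 8: 0.54, 10: 0.49}

# Express each as "percent lower odds of failure", rounded to whole percents.
percent_lower = {
    grade: round(-percent_change_in_odds(ratio))
    for grade, ratio in gender_odds_ratios.items()
}
```

The resulting whole-number percentages agree with the 27%, 39%, 46% and 51% stated in the paragraph above.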

In the four grades examined, students eligible for free or reduced price lunch are found to
have significantly higher failure rates than their ineligible peers, which is not at all
surprising. The large and statistically significant odds ratios, 2.74, 2.62, 2.33 and 2.09
for grades 3, 5, 8 and 10 respectively, all run against low-income students. Eligibility
for free or reduced price lunch means more than double the odds of falling below the HSA
standards. Those odds ratios suggest that poverty has a much stronger effect on academic
success than gender. The univariate logistic model has an adjusted R-square of 0.75 for
all four grades.
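For a single binary predictor, the odds ratio can be computed directly from the two groups' failure rates. The sketch below applies that arithmetic to the grade 3 and grade 5 failure rates listed in Table 2 for students receiving and not receiving free lunch; the function name is an assumption made for illustration, and the study itself reports profile likelihood estimates from SAS rather than this hand calculation.

```python
def odds_ratio_from_rates(rate_group: float, rate_reference: float) -> float:
    """Odds ratio of failure for a group relative to a reference group.

    Both arguments are failure rates expressed as proportions (0 to 1).
    """
    odds_group = rate_group / (1.0 - rate_group)
    odds_reference = rate_reference / (1.0 - rate_reference)
    return odds_group / odds_reference


# Failure rates from Table 2, converted from percentages to proportions:
# (receiving free lunch, not receiving free lunch).
grade3_or = odds_ratio_from_rates(0.6422, 0.3961)
grade5_or = odds_ratio_from_rates(0.6462, 0.4112)
```

For these two grades the results round to 2.74 and 2.62, matching the odds ratios reported in the text.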

Among the four racial/ethnic groups, East Asian and White students have quite similar
failure rates, which are clearly lower than those of the Filipino and Hawaiian groups.
Compared to Whites, Filipino and Hawaiian students at all grade levels experience
significantly higher odds of failure, whereas no statistical difference is found between
East Asian and White students at any of the grade levels except grade three. The odds
ratios for the Filipino vs. White contrast are 2.90, 2.40, 2.63 and 2.70 for the four
grades respectively; the odds ratios for the Hawaiian vs. White contrast are even greater,
3.38, 3.44, 3.60 and 3.59 for the four grades respectively. Hawaiian students thus face
roughly 3.4 to 3.6 times the odds of failure that White students face. The only
statistically significant difference found between East Asians and Whites is an odds ratio
of 0.84 at the third grade, indicating that East Asian children actually outperform their
White peers at that grade. All this points to race/ethnicity as possibly the greatest
influence among the three predictors - a possibility that is further confirmed by the
findings resulting from the next research question. The adjusted R-square due to
race/ethnicity remains 0.75 at all four grade levels.

V.2. Research Question Two

Given three demographic predictors, the full logistic regression model can have seven
effects: three main effects, three two-way interaction effects and one three-way
interaction effect. When the full model was applied to the four grade levels, the three-way
interaction effect was non-significant in all cases. This finding justified a subsequent
search for more parsimonious predictive models. It is worth adding that this
finding corroborates the conclusion, reached in several large-scale studies (N > 1,000)
that examined academic performance in math, reading and science, that there is no need to
consider a possible three-way interaction between gender, poverty and ethnicity (Bali &
Alvarez, 2003; Derington-Moore, 2003; Gertz, 1999; O’Conner & Miranda, 2002;
Patton, 2003; Saturnelli & Repa, 1995).

Further examination of the two-way interactions revealed no consistent or interpretable
patterns, so a decision was made to adopt a main-effects-only model for all the grade
levels. The pattern of effects, in terms of direction, magnitude and accuracy in prediction,
is similar enough to suggest that there may exist one single underlying model across the
grade levels. The results are reported in Table Three.



Table 3.a-d 



The three-predictor model can correctly classify 65.0%, 64.8%, 64.5% and 64.8% of the
students into the “pass” or “fail” group at grades 3, 5, 8 and 10 respectively. In other
words, without any consideration of academic capability, roughly 65% of the students’
HSA results could be correctly predicted. This is clear evidence that demographic variables
beyond the control of the public education system are potent determinants of academic
achievement in Hawaii. This demographics-based predictive model works in three ways,
disadvantaging boys, poor students, and Filipino and Hawaiian students. Conversely, it
favors girls, higher-income students, and students of White or East Asian ancestry.

A significant gender effect in favor of females is consistent across the grades. Other 
factors being equal, girls’ odds of failure may be 31% lower at grade three, 45% lower at 
grade five, 58% lower at grade eight, and 50% lower at grade ten. Gender appears to have 
a greater impact at the higher rather than lower grades. 

A more powerful determinant than gender is eligibility for free or reduced price lunch.
This eligibility translates into a 110% increase in the odds of failure at the third grade,
a 103% increase at the fifth grade, 58% at the eighth grade, and 75% at the tenth grade.
Unlike the gender effect, its negative impact seems to weaken as the student gets older.
Nonetheless, the magnitude of the odds ratio far exceeds the corresponding
gender-related odds ratio at each grade level.

The most potent determinant is found to be race/ethnicity. Because poverty and
race/ethnicity are correlated, there has been a long-standing debate as to whether or not
race/ethnicity is only a proxy for poverty (e.g., Abbot & Joireman, 2001; Harkreader &
Weathersby, 1998; Williams, 1972). The analyses based on the HSA data show that
race/ethnicity has a distinct, unique effect, in spite of its correlation with low-income
status. Furthermore, as far as the contrasts between Whites on one hand and
Filipinos and Hawaiians on the other are concerned, race/ethnicity seems to have a much
more drastic influence than poverty. After the effect of poverty is controlled for, Filipino
students’ odds of failure are 122% higher than the Whites’ at grade five, and that is the
lowest odds ratio attributable to race/ethnicity. The most dramatic example is that
Hawaiian students’ odds of failure are 361% of the Whites’ at the eighth grade. Such
empirical evidence strengthens the argument that race/ethnicity impacts achievement
over and beyond the effect of the associated variable of poverty (e.g., Bali & Alvarez,
2004; Brooks-Gunn, Duncan, & Klebanov, 1994; Lubienski, 2001). The three-effect
logistic regression model has a stable adjusted R-square of 0.75 at all the grade levels
examined.

Given the three demographic variables, there are 16 possible combinations at each grade
level (2 genders x 2 income statuses x 4 racial/ethnic groups), with a wide range of
probabilities of failure. The contrasts between the subgroups least and most likely to fail
(East Asian females without free lunch vs. Hawaiian males with free lunch) are 0.24 vs.
0.75, 0.24 vs. 0.78, 0.23 vs. 0.79, and 0.24 vs. 0.81 for grades 3, 5, 8 and 10 respectively
(Uyeno, Zhang & Chin-Chance, 2005). This is the picture the HDOE faced in 2002 as it
began the arduous task of ensuring that all students and all schools meet the NCLB
mandates by 2014.
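How a main-effects-only model assigns one predicted probability of failure to each of the 16 demographic combinations can be sketched as follows. The intercept and coefficients below are HYPOTHETICAL placeholders chosen only to have the same signs as the effects described in the text; they are not the estimates in Table 3, and the names are assumptions introduced for this illustration.

```python
import math
from itertools import product

GENDERS = ("male", "female")               # male is the reference level
LUNCH = ("ineligible", "eligible")         # ineligible is the reference level
ETHNICITIES = ("White", "East Asian", "Filipino", "Hawaiian")  # White is the reference

# HYPOTHETICAL log-odds coefficients, for illustration only (not from Table 3).
INTERCEPT = -0.5
COEFFICIENTS = {
    "female": -0.4,
    "eligible": 0.7,
    "East Asian": -0.2,
    "Filipino": 0.9,
    "Hawaiian": 1.2,
}


def predicted_failure_probability(gender: str, lunch: str, ethnicity: str) -> float:
    """Main-effects-only logistic model: sum the log-odds terms, then invert."""
    z = INTERCEPT
    for level in (gender, lunch, ethnicity):
        z += COEFFICIENTS.get(level, 0.0)  # reference levels contribute 0
    return 1.0 / (1.0 + math.exp(-z))


# Enumerate every gender x lunch x ethnicity combination.
subgroups = list(product(GENDERS, LUNCH, ETHNICITIES))
probabilities = {group: predicted_failure_probability(*group) for group in subgroups}

least_likely_to_fail = min(probabilities, key=probabilities.get)
most_likely_to_fail = max(probabilities, key=probabilities.get)
```

With coefficients of these signs, the extremes fall on the same two subgroups named in the text: East Asian females without free lunch and Hawaiian males with free lunch.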

In short, a general logistic model consisting of three main effects is adequately applicable 
to the four grade levels. The model can correctly classify about 65% of the students in 
each grade and maintain a fairly consistent pattern of significant effects due to gender, 
low-income status and ethnicity. Of the three effects, ethnicity appears to be the most 
powerful determinant, followed by low-income status and gender. This hitherto 
undocumented pattern of relative potency is consistent across the four grade levels in 
Hawaii. 

V.3. Research Question Three

The last part of the study shifts attention to those students who are misclassified by the
logistic model. Table Four reports the sensitivity and specificity of the model at each
grade level. Sensitivity refers to the percentage of true failures identified by the logistic
model, and specificity refers to the percentage of true successes identified by the model.
Also included in the table are the probabilities of false failure and false success as
identified by the logistic model.
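The definitions of sensitivity and specificity given above can be made concrete with a short sketch. It assumes a cutoff of 0.50 on the predicted probability of failure and treats a probability at or above the cutoff as a predicted failure; the function name and the eight-student toy data are hypothetical and are not drawn from the HSA records.

```python
from typing import Dict, Sequence


def classification_summary(
    actually_failed: Sequence[bool],
    predicted_prob_fail: Sequence[float],
    cutoff: float = 0.5,
) -> Dict[str, float]:
    """Summarize how well predicted failure probabilities classify students."""
    true_fail = false_fail = true_pass = false_pass = 0
    for failed, prob in zip(actually_failed, predicted_prob_fail):
        predicted_fail = prob >= cutoff  # assumed rule: at or above cutoff means "fail"
        if predicted_fail and failed:
            true_fail += 1
        elif predicted_fail:
            false_fail += 1   # predicted to fail, actually passed
        elif failed:
            false_pass += 1   # predicted to pass, actually failed
        else:
            true_pass += 1
    total = true_fail + false_fail + true_pass + false_pass
    return {
        # share of actual failures that the model identifies
        "sensitivity": true_fail / (true_fail + false_pass),
        # share of actual passes that the model identifies
        "specificity": true_pass / (true_pass + false_fail),
        "percent_correct": 100.0 * (true_fail + true_pass) / total,
    }


# Hypothetical toy data: eight students, four of whom actually failed.
toy_actual = [True, True, True, False, False, False, True, False]
toy_probs = [0.8, 0.6, 0.4, 0.3, 0.7, 0.2, 0.9, 0.45]
toy_summary = classification_summary(toy_actual, toy_probs)
```

The sketch does not guard against a data set with no failures or no passes, in which case the corresponding ratio is undefined.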



Table 4



With the cutoff of predicted probability of failure set at 0.50, the predictive model, with
all ethnicities/races considered together, shows a sensitivity of 0.64 at grade three, 0.65 at
grade five, 0.69 at grade eight, and 0.62 at grade ten. Specificities are 0.67, 0.65, 0.61 and
0.68 for grades three, five, eight and ten respectively. Those indices remain fairly stable
across the grades, providing further evidence for the feasibility of a general underlying
logistic model across the grades.

The misclassified students at each grade level fall into two categories: those who are
predicted to pass (not fail) but actually failed (“false negatives” henceforth), and those
who are predicted to fail but actually passed (“false positives” henceforth). Although
much research has been conducted relating academic performance to demographic
variables, particularly low-income status and race/ethnicity, probabilities of false
negatives or positives have not received much attention. In Hawaii, this neglect may be
partly due to the fact that no viable pass/fail standards existed in public schools for years
until the NCLB of 2001. From a broader perspective, while the effects of social, cultural,
and economic factors on academic attainment are widely accepted, it is rare to find
carefully thought out empirical research on inaccuracies in inferring from such factors to
individual achievement within subgroups. The NCLB’s unambiguous requirement of fair
and clear measures of subgroup performance prompted the third research question.

The racial/ethnic distribution of the false negatives deviates drastically from the expected
proportions at each grade level (chi-square = 478, 436, 729 and 522 for grades 3, 5, 8 and
10 respectively; df = 3, p < 0.001 for all cases). For example, among the third graders,
37.63% of the 1,693 false negatives are East Asian students (significantly higher than the
population proportion of 22.21%), and 29.36% are Whites (significantly higher than the
population proportion of 19.13%). East Asian and White students in Hawaii’s
public schools would thus enjoy a better than deserved academic reputation, were such a
reputation to be based exclusively on the three demographic variables. On the other hand,
Filipino and Hawaiian students would be more likely to be disparaged than their East
Asian and White counterparts. About 14.06% of the false negatives are Filipinos
(significantly lower than the population proportion of 31.95%), and 29.36% are Hawaiians
(significantly lower than the population proportion of 19.13%). The observed probability of
a false negative (predicted pass with an actual outcome of failure) being East Asian or
White is 0.67, as compared to 0.33 for Filipinos or Hawaiians. The so-called academic
success of East Asian and White students cannot be accurately interpreted unless more
research attention is devoted to the number of false negatives in theoretical or
statistical models based exclusively on demographics. The over-representation of East
Asians or Whites (67%, 66%, 75% and 73% for grades three to ten respectively), or
under-representation of Filipinos or Hawaiians, persists among the false negatives across
the grades.

The other side of the story is of course that among the false positives, i.e., predicted
failure with an actual outcome of pass, it is the Filipinos and Hawaiians who outnumber
East Asians or Whites. For instance, of the 1,025 false positive tenth graders, 935
(91.22%) are Filipinos and Hawaiians. Only 90 (8.78%) are East Asians or Whites. This
pattern is stable across the grades. The probability that a false positive (a student
predicted to fail who actually passed) is East Asian or White is only 0.10, 0.13, 0.10, and
0.09 for grades three to ten respectively.
The overall percent of correct classification based upon the demographics does not tell
the whole story. What is lost is the exciting news about the valiant efforts and personal
victories of the many educationally disadvantaged Filipino and Hawaiian students in
Hawaii’s public schools who manage to beat the heavy odds and meet or exceed the HSA
proficiency level. Approximately 90% of the 1,543 false positives are Filipino or
Hawaiian at the third grade; so are 87% of the 1,659 at the fifth grade, 90% of the 1,588
at the eighth grade, and 91% of the 1,025 at the tenth grade.



VI. Conclusion 

The present study is limited by the absence of many other demographic variables that 
might conceivably have contributed to the failure rates on the 2002 HSA reading tests. It 
also faces the methodological challenge of how to include numerous smaller subgroups
in the analyses. The predicted probabilities of failure used in classifying the students
into the predicted pass and fail groups may be optimistically biased because the predicted 
results and the actual results are from the same data. Validations using 2003 and 2004 
HSA data are under consideration. 

Nevertheless, this research has provided the HDOE a preliminary overall understanding
of what roles the major demographic variables of gender, low-income status, and
race/ethnicity played, individually and jointly, in determining students’ reading
performance in the NCLB baseline year of 2002. It has been found that one single
main-effects-only logistic model is viable, correctly classifying approximately 65% of the
students into the “pass” or “fail” group at each of the four grade levels examined. If the
NCLB is to come anywhere near its stated overall objective, logistic regression
coefficients associated with the demographic variables should all have decreased to a
value near 0 by 2014 (odds ratio close to 1). Barring that, the HDOE may take heart in
the hitherto undocumented success story that many educationally disadvantaged Filipino
and Hawaiian students, with support from Hawaii’s public education system, have proved
to be capable of overcoming their odds of failure and reaching the HSA proficiency level.
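The link between a logistic regression coefficient and its odds ratio, which underlies the 2014 benchmark stated above, is exponentiation: odds ratio = exp(coefficient). A minimal sketch follows; the function name and the example inputs are assumptions used purely for illustration.

```python
import math


def odds_ratio_from_coefficient(coefficient: float) -> float:
    """Convert a logistic regression coefficient (log odds) to an odds ratio."""
    return math.exp(coefficient)


# A coefficient of 0 means the predictor makes no difference to the odds.
no_effect = odds_ratio_from_coefficient(0.0)

# Going the other way: an odds ratio of 2.74 corresponds to a coefficient
# of ln(2.74), which is roughly 1.01 on the log-odds scale.
coefficient_for_2_74 = math.log(2.74)
```

Coefficients shrinking toward 0 therefore correspond to odds ratios converging on 1 from either side, whether the original ratio was above 1 (poverty, Filipino or Hawaiian ethnicity) or below 1 (female gender).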



Table 1

Frequency Distribution of the Data

Grade  Ethnicity   Free Lunch?      N  Proportion  Sample Size  Total Usable Scores
                                        of Sample  (per grade)          (per grade)

3      East Asian  No           1,661        0.18        9,257               12,285
                   Yes            395        0.04
       Filipino    No           1,310        0.14
                   Yes          1,162        0.13
       Hawaiian    No           1,080        0.12
                   Yes          1,878        0.20
       Caucasian   No           1,198        0.13
                   Yes            573        0.06

5      East Asian  No           1,730        0.18        9,602               12,468
                   Yes            390        0.04
       Filipino    No           1,378        0.14
                   Yes          1,191        0.12
       Hawaiian    No           1,173        0.12
                   Yes          1,894        0.20
       Caucasian   No           1,305        0.14
                   Yes            541        0.06

8      East Asian  No           1,653        0.21        8,043               10,620
                   Yes            254        0.03
       Filipino    No           1,329        0.17
                   Yes            893        0.11
       Hawaiian    No           1,180        0.15
                   Yes          1,296        0.16
       Caucasian   No           1,067        0.13
                   Yes            371        0.05

10     East Asian  No           1,606        0.25        6,504                9,068
                   Yes            185        0.03
       Filipino    No           1,303        0.20
                   Yes            564        0.09
       Hawaiian    No             956        0.15
                   Yes            695        0.11
       Caucasian   No             983        0.15
                   Yes            212        0.03









Table 2

Rates of Failure by Subgroups (%)

                              Grade 3  Grade 5  Grade 8  Grade 10

Study Sample                    50.26    50.95    50.04     50.49

Gender
  Male                          54.32    57.02    59.11     58.17
  Female                        46.28    44.70    41.45     41.83

Income Status
  Not receiving free lunch      39.61    41.12    43.70     45.26
  Receiving free lunch          64.22    64.62    61.83     54.74

Ethnicity
  East Asian                    30.98    33.96    33.14     32.89
  Filipino                      60.68    57.61    58.24     60.69
  Hawaiian                      64.27    66.03    64.98     67.84
  Caucasian                     34.73    36.13    34.08     36.99




